Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation

نویسندگان

Haoyue Shi

Caihua Li

Junfeng Hu

چکیده

Previous researches have shown that learning multiple representations for polysemous words can improve the performance of word embeddings on many tasks. However, this leads to another problem. Several vectors of a word may actually point to the same meaning, namely pseudo multi-sense. In this paper, we introduce the concept of pseudo multi-sense, and then propose an algorithm to detect such cases. With the consideration of the detected pseudo multi-sense cases, we try to refine the existing word embeddings to eliminate the influence of pseudo multi-sense. Moreover, we apply our algorithm on previous released multi-sense word embeddings and tested it on artificial word similarity tasks and the analogy task. The result of the experiments shows that diminishing pseudo multi-sense can improve the quality of word representations. Thus, our method is actually an efficient way to reduce linguistic complexity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Understanding and Improving Multi-Sense Word Embeddings via Extended Robust Principal Component Analysis

Unsupervised learned representations of polysemous words generate a large of pseudo multi senses since unsupervised methods are overly sensitive to contextual variations. In this paper, we address the pseudo multi-sense detection for word embeddings by dimensionality reduction of sense pairs. We propose a novel principal analysis method, termed ExRPCA, designed to detect both pseudo multi sense...

متن کامل

Do Multi-Sense Embeddings Improve Natural Language Understanding?

Learning a distinct representation for each sense of an ambiguous word could lead to more powerful and fine-grained models of vector-space representations. Yet while ‘multi-sense’ methods have been proposed and tested on artificial wordsimilarity tasks, we don’t know if they improve real natural language understanding tasks. In this paper we introduce a multisense embedding model based on Chine...

متن کامل

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by exploiting crosslingual signals to aid sense identification. We present a multi-view Bayesian non-parametric algorithm which improves multi-sense word embeddings by...

متن کامل

One Representation per Word - Does it make Sense for Composition?

In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf singlevector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vec...

متن کامل

On Modeling Sense Relatedness in Multi-prototype Word Embedding

To enhance the expression ability of distributional word representation learning model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. However, most related work ignores the relatedness among word senses which actually plays an important role. In this paper, we propose a novel appro...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation

نویسندگان

چکیده

منابع مشابه

Understanding and Improving Multi-Sense Word Embeddings via Extended Robust Principal Component Analysis

Do Multi-Sense Embeddings Improve Natural Language Understanding?

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

One Representation per Word - Does it make Sense for Composition?

On Modeling Sense Relatedness in Multi-prototype Word Embedding

عنوان ژورنال:

اشتراک گذاری